Search results for "cosine similarity"
showing 6 items of 6 documents
An ontology-based retrieval system for mammographic reports
2015
In the healthcare domain, it can be useful to compare unstructured free-text clinical reports in order to search for similar and/or relevant clinical cases. In data mining and text analysis tasks, cosine similarity is commonly used for text comparison: the standard document-vector cosine similarity is computed between the two vectors representing the report pair under analysis. In this paper, a novel system based on text pre-processing techniques and modelled medical knowledge, using an improved radiological ontology, is proposed. Medical terms organized in a hierarchical tree can assess semantic similarity relationships between unstructured repo…
Particle Swarm Optimization as a New Measure of Machine Translation Efficiency
2018
The present work proposes a new approach to measuring the efficiency of evolutionary algorithm-based Machine Translation. We implement some attributes of evolutionary algorithms, using cosine similarity as the objective function of a Particle Swarm Optimization (PSO) algorithm; we then evaluate an English text set for translation precision into Spanish as a simulated benchmark and explore the backward process. Our results show that the PSO algorithm can be used to translate sentences in multiple languages with only one identifier; in other words, the technology presented is language-pair independent. Specifically, we indicate that our cosine similarity objective function improves the veloci…
A bibliometric approach to finding fields that co-evolved with information technology
2020
Among declining industries, some, such as the music industry, have been revived by information technology (IT). At the same time, in academic fields, some have expected co-evolutions between IT and other fields to cause the resurgence of either field. In this research, the clustering of citation networks with 14,438 academic papers resulted in the identification of 28 academic fields in the areas “Computer Science” or “Information Science and Library Science.” Co-evolutions between these 28 fields and the fields citing them were evaluated by an investigation of contents; a methodology to search for co-evolutions was also proposed. This paper proposes that pairs of academic fields (wi…
Monolingual and cross-lingual intent detection without training data in target languages
2021
Due to recent DNN advancements, many NLP problems can be effectively solved using transformer-based models and supervised data. Unfortunately, such data is not available in some languages. This research is based on the assumptions that (1) training data can be obtained by machine-translating it from another language
Feature Dimensionality Reduction for Mammographic Report Classification
2016
The amount and variety of available medical data coming from multiple and heterogeneous sources can inhibit analysis, manual interpretation, and the use of simple data management applications. In this paper, a deep overview of the principal algorithms for dimensionality reduction is carried out; moreover, the most effective techniques are applied to a dataset composed of 4461 mammographic reports. The most useful medical terms are converted and represented using a TF-IDF matrix, in order to enable data mining and retrieval tasks. A series of queries has been performed on the raw matrix and on the same matrix after the dimensionality reduction obtained using the most useful techni…
A Novel Multidimensional Scaling Technique for Mapping Word-Of-Mouth Discussions
2009
The techniques which utilize Multidimensional Scaling (MDS) as a fundamental statistical tool have been well developed since the late 1970s. In this paper we show how an MDS scheme can be enhanced by incorporating into it a Stochastic Point Location (SPL) strategy (one which optimizes the former’s gradient descent learning phase) and a new Stress function. The enhanced method, referred to as MDS SPL, has been used in conjunction with a combination of the TF-IDF and Cosine Similarities on a very noisy Word-Of-Mouth (WoM) discussion set consisting of postings concerning mobile phones, yielding extremely satisfying results.
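Several of the abstracts listed above combine TF-IDF document vectors with cosine similarity. The following is a minimal, generic sketch of that combination in Python, not the method of any listed paper. The function names `tf_idf_vectors` and `cosine_similarity`, the smoothed IDF formula, and the toy token lists are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) from a list of tokenized documents."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            # Relative term frequency times a smoothed IDF (an assumed variant).
            term: (count / len(doc)) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # Convention chosen here: an empty vector matches nothing.
    return dot / (norm_u * norm_v)


# Hypothetical toy corpus, already tokenized.
docs = [
    ["benign", "mass", "left", "breast"],
    ["suspicious", "mass", "right", "breast"],
    ["mobile", "phone", "battery"],
]
vectors = tf_idf_vectors(docs)
print(cosine_similarity(vectors[0], vectors[1]))  # documents sharing terms
print(cosine_similarity(vectors[0], vectors[2]))  # documents with no shared terms
```

Storing each vector as a dict keeps only the non-zero weights, which suits short texts over a large vocabulary. A dense matrix from a numerical library would be the usual choice at larger scale.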